The search functionality is under construction.

Author Search Result

[Author] Shigeru YAMASHITA(29hit)

21-29hit(29hit)

  • A Systematic Methodology for Design and Worst-Case Error Analysis of Approximate Array Multipliers

    Takahiro YAMAMOTO  Ittetsu TANIGUCHI  Hiroyuki TOMIYAMA  Shigeru YAMASHITA  Yuko HARA-AZUMI  

     
    LETTER

      Vol:
    E100-A No:7
      Page(s):
    1496-1499

    Approximate computing is considered as a promising approach to design of power- or area-efficient digital circuits. This paper proposes a systematic methodology for design and worst-case accuracy analysis of approximate array multipliers. Our methodology systematically designs a series of approximate array multipliers with different area, delay, power and accuracy characteristics so that an LSI designer can select the one which best fits to the requirements of her/his applications. Our experiments explore the trade-offs among area, delay, power and accuracy of the approximate multipliers.

  • Efficient Realization of an SC Circuit with Feedback and Its Applications Open Access

    Yuto ARIMURA  Shigeru YAMASHITA  

     
    PAPER-VLSI Design Technology and CAD

      Pubricized:
    2023/10/26
      Vol:
    E107-A No:7
      Page(s):
    958-965

    Stochastic Computing (SC) allows additions and multiplications to be realized with lower power than the conventional binary operations if we admit some errors. However, for many complex functions which cannot be realized by only additions and multiplications, we do not know a generic efficient method to calculate a function by using an SC circuit; it is necessary to realize an SC circuit by using a generic method such as polynomial approximation methods for such a function, which may lose the advantage of SC. Thus, there have been many researches to consider efficient SC realization for specific functions; an efficient SC square root circuit with a feedback circuit was proposed by D. Wu et al. recently. This paper generalizes the SC square root circuit with a feedback circuit; we identify a situation when we can implement a function efficiently by an SC circuit with a feedback circuit. As examples of our generalization, we propose SC circuits to calculate the n-th root calculation and division. We also show our analysis on the accuracy of our SC circuits and the hardware costs; our results show the effectiveness of our method compared to the conventional SC designs; our framework may be able to implement a SC circuit that is better than the existing methods in terms of the hardware cost or the calculation error.

  • Tensor Rank and Strong Quantum Nondeterminism in Multiparty Communication

    Marcos VILLAGRA  Masaki NAKANISHI  Shigeru YAMASHITA  Yasuhiko NAKASHIMA  

     
    PAPER-Fundamentals of Information Systems

      Vol:
    E96-D No:1
      Page(s):
    1-8

    In this paper we study quantum nondeterminism in multiparty communication. There are three (possibly) different types of nondeterminism in quantum computation: i) strong, ii) weak with classical proofs, and iii) weak with quantum proofs. Here we focus on the first one. A strong quantum nondeterministic protocol accepts a correct input with positive probability and rejects an incorrect input with probability 1. In this work we relate strong quantum nondeterministic multiparty communication complexity to the rank of the communication tensor in the Number-On-Forehead and Number-In-Hand models. In particular, by extending the definition proposed by de Wolf to nondeterministic tensor-rank (nrank), we show that for any boolean function f when there is no prior shared entanglement between the players, 1) in the Number-On-Forehead model the cost is upper-bounded by the logarithm of nrank(f); 2) in the Number-In-Hand model the cost is lower-bounded by the logarithm of nrank(f). Furthermore, we show that when the number of players is o(log log n), we have NQP BQP for Number-On-Forehead communication.

  • An Energy-Efficient Patchable Accelerator and Its Design Methods

    Hiroaki YOSHIDA  Masayuki WAKIZAKA  Shigeru YAMASHITA  Masahiro FUJITA  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E97-A No:12
      Page(s):
    2507-2517

    With the shorter time-to-market and the rising cost in SoC development, the demand for post-silicon programmability has been increasing. Recently, programmable accelerators have attracted more attention as an enabling solution for post-silicon engineering change. However, programmable accelerators suffers from 5∼10X less energy efficiency than fixed-function accelerators mainly due to their extensive use of memories. This paper proposes a highly energy-efficient accelerator which enables post-silicon engineering change by a control patching mechanism. Then, we propose a patch compilation method from a given pair of an original design and a modified design. We also propose a design method to add redundant wires in advance to decrease the necessary amount of patch memory for post-silicon engineering change. Experimental results demonstrate that the proposed accelerators offer high energy efficiency competitive to fixed-function accelerators and can achieve about 5X higher efficiency than the existing programmable accelerators. We also show the trade-off between redundant wires and the necessary amount of patch memory.

  • An Efficient Method for Finding an Optimal Bi-Decomposition

    Shigeru YAMASHITA  Hiroshi SAWADA  Akira NAGOYA  

     
    PAPER-Logic Synthesis

      Vol:
    E81-A No:12
      Page(s):
    2529-2537

    This paper presents a new efficient method for finding an "optimal" bi-decomposition form of a logic function. A bi-decomposition form of a logic function is the form: f(X) = α(g1(X1), g2(X2)). We call a bi-decomposition form optimal when the total number of variables in X1 and X2 is the smallest among all bi-decomposition forms of f. This meaning of optimal is adequate especially for the synthesis of LUT (Look-Up Table) networks where the number of function inputs is important for the implementation. In our method, we consider only two bi-decomposition forms; (g1 g2) and (g1 g2). We can easily find all the other types of bi-decomposition forms from the above two decomposition forms. Our method efficiently finds one of the existing optimal bi-decomposition forms based on a branch-and-bound algorithm. Moreover, our method can also decompose incompletely specified functions. Experimental results show that we can construct better networks by using optimal bi-decompositions than by using conventional decompositions.

  • SPFD-Based Flexible Transformation of LUT-Based FPGA Circuits

    Katsunori TANAKA  Shigeru YAMASHITA  Yahiko KAMBAYASHI  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E88-A No:4
      Page(s):
    1038-1046

    In this paper, we present the condition for the effective wire addition in Look-Up-Table-based (LUT-based) field programmable gate array (FPGA) circuits, and an optimization procedure utilizing the effective wire addition. Each wire has different characteristics, such as delay and power dissipation. Therefore, the replacement of one critical wire for the circuit performance with many non-critical ones, i.e., many-addition-for-one-removal (m-for-1) is sufficiently useful. However, the conventional logic optimization methods based on sets of pairs of functions to be distinguished (SPFDs) for LUT-based FPGA circuits do not make use of the m-for-1 manipulation, and perform only simple replacement and removal, i.e., the one-addition-for-one-removal (1-for-1) manipulation and the no-addition-for-one-removal (0-for-1) manipulation, respectively. Since each LUT can realize an arbitrary internal function with respect to a specified number of input variables, there is no sufficient condition at the logic design level for simple wire addition. Moreover, in general, simple addition of a wire has no effects for removal of another wire, and it is important to derive the condition for non-simple and effective wire addition. We found the SPFD-based condition that wire addition is likely to make another wire redundant or replaceable, and developed an optimization procedure utilizing this effective wire addition. According to the experimental results, when we focused on the delay reduction of LUT-based FPGA circuits, our method reduced the delay by 24.2% from the initial circuits, while the conventional SPFD-based logic optimization and the enhanced global rewiring reduced it by 14.2% and 18.0%, respectively. Thus, our method presented in this paper is sufficiently practical, and is expected to improve the circuit performance.

  • Clique-Based Architectural Synthesis of Flow-Based Microfluidic Biochips

    Trung Anh DINH  Shigeru YAMASHITA  Tsung-Yi HO  Yuko HARA-AZUMI  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E96-A No:12
      Page(s):
    2668-2679

    Microfluidic biochips, also referred to “lab-on-a-chip,” have been recently proposed to integrate all the necessary functions for biochemical analyses. This technology starts a new era of biology science, where a combination of electronic and biology is first introduced. There are several types of microfluidic biochips; among them there has been a great interest in flow-based microfluidic biochips, in which the flows of liquid is manipulated using integrated microvalves. By combining several microvalves, more complex resource units such as micropumps, switches and mixers can be built. For efficient execution, the flows of liquid routes in microfluidic biochips need to be scheduled under some resource constraints and routing constraints. The execution time of a biochemical application depends strongly on the binding and scheduling result. The most previously developed binding and scheduling algorithm is based on heuristics, and there has been no method to obtain optimal results. Considering the above, we propose an optimal method by casting the problem to a clique problem. Moreover, this paper also presents some heuristic techniques for computational time reduction. Experiments demonstrate that the proposed method is able to reduce the execution time of biochemical applications by more than 15% compared with the previous approach. Moreover, the proposed heuristic method is able to produce the results at no or little cost of optimality, in significantly shorter time than the optimal method.

  • Efficient Kernel Generation Based on Implicit Cube Set Representations and Its Applications

    Hiroshi SAWADA  Shigeru YAMASHITA  Akira NAGOYA  

     
    PAPER-Logic Synthesis

      Vol:
    E83-A No:12
      Page(s):
    2513-2519

    This paper presents a new method that efficiently generates all of the kernels of a sum-of-products expression. Its main feature is the memorization of the kernel generation process by using a graph structure and implicit cube set representations. We also show its applications for common logic extraction. Our extraction method produces smaller circuits through several extensions than the extraction method based on two-cube divisors known as best ever.

  • Selective Check of Data-Path for Effective Fault Tolerance

    Tanvir AHMED  Jun YAO  Yuko HARA-AZUMI  Shigeru YAMASHITA  Yasuhiko NAKASHIMA  

     
    PAPER-Design Methodology

      Vol:
    E96-D No:8
      Page(s):
    1592-1601

    Nowadays, fault tolerance has been playing a progressively important role in covering increasing soft/hard error rates in electronic devices that accompany the advances of process technologies. Research shows that wear-out faults have a gradual onset, starting with a timing fault and then eventually leading to a permanent fault. Error detection is thus a required function to maintain execution correctness. Currently, however, many highly dependable methods to cover permanent faults are commonly over-designed by using very frequent checking, due to lack of awareness of the fault possibility in circuits used for the pending executions. In this research, to address the over-checking problem, we introduce a metric for permanent defects, as operation defective probability (ODP), to quantitatively instruct the check operations being placed only at critical positions. By using this selective checking approach, we can achieve a near-100% dependability by having about 53% less check operations, as compared to the ideal reliable method, which performs exhaustive checks to guarantee a zero-error propagation. By this means, we are able to reduce 21.7% power consumption by avoiding the non-critical checking inside the over-designed approach.

21-29hit(29hit)